Cloud Storage
Cloud storage is a model of data storage in which digital information is kept on remote servers and accessed over the internet, rather than on local hard drives or physical media. Popular cloud storage services such as Amazon S3, Google Cloud Storage, and Microsoft Azure Blob Storage let users store, retrieve, and manage large amounts of data without maintaining physical hardware. These services offer scalability, durability, and accessibility, which makes them core components of modern data architectures. Data analysts and engineers often use cloud storage to host data lakes, back up datasets, or act as a central repository for data processing and analytics workflows. When working with tools like DuckDB, you can query data stored in cloud storage directly, using syntax like:
SELECT * FROM read_parquet('s3://my-bucket/data.parquet');
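This capability makes analysis efficient because the dataset does not have to be downloaded in full first. With a columnar format such as Parquet, DuckDB can read just the columns and row groups a query needs from the remote file.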
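In practice, a query against S3 usually needs a little setup first. The following is a minimal sketch, assuming a recent DuckDB release with the `httpfs` extension and the secrets manager available. The bucket path is the one used above; the key ID, secret, region, and the column names `category` and `amount` are placeholders invented for illustration, not values from any real dataset.

```sql
-- Load the extension that provides S3/HTTP(S) access.
-- (Recent DuckDB versions can often autoload it; doing it explicitly is harmless.)
INSTALL httpfs;
LOAD httpfs;

-- Register credentials for the bucket. All values here are placeholders.
CREATE SECRET my_s3_secret (
    TYPE S3,
    KEY_ID 'YOUR_ACCESS_KEY_ID',
    SECRET 'YOUR_SECRET_ACCESS_KEY',
    REGION 'us-east-1'
);

-- Select only the columns you need and filter early, so DuckDB can skip
-- the rest of the file. `category` and `amount` are hypothetical columns.
SELECT category, sum(amount) AS total_amount
FROM read_parquet('s3://my-bucket/data.parquet')
WHERE amount > 0
GROUP BY category
ORDER BY total_amount DESC;
```

Naming specific columns instead of using `SELECT *` is the design choice that matters most here: it lets the reader fetch less data from the remote file, which is where the efficiency gain comes from.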